Most documentation of computer programs can be summed up in the phrase, “Even 
when it’s good, it’s bad.” Management may occasionally give documentation token priority, 
but programmers seem to give it no priority at all, perhaps because of their training. Pro- 
grammer training is either formal or informal. In formal training courses, documentation is 
usually not a standard part of the curriculum; in informal or on-the-job training, it is usually 
not even mentioned. This lack of training is a basic reason for the problem of documentation, 
a problem that is compounded whenever management deemphasizes program documentation 
simply because past experience has shown that what had been produced was generally 
ineffective. 

The chief reason that documentation is so poor may be that it has been considered a 
manual process when it should have been considered a computer problem. Certainly, no one 
considers compiling a manual process today, although, years ago, compiler functions were 
performed manually. 

The need for documentation seems to be obvious. The primary concerns of both man- 
agers and programmers are program productivity, debugging, flexibility, integration, and 
reliability. Good documentation helps to fulfill these purposes; poor documentation, on the 
other hand, does not. Any organization can obtain good documentation, either manual or 
automatic, if it concentrates on program organization rules; programming standards, includ- 
ing the naming of tagged lines, proper commentary, modular programming, and restrictions 
in the use of certain programming techniques; program monitoring and security, including 
systematic recording of changes in programs, systematic recording of reasons for changes, 
and protection of programs; technical overviews of programs (using tape recordings, if pre- 
ferred); and parallel development of programs and documentation. 

Program organization rules are important because, although good programmers have an 
organized approach to writing programs, they, unfortunately, usually develop styles of their 
own. Rarely will two programmers use the same organization. Because a programmer does 
not work on a program forever, it is obvious that organization should not be permitted to 
suffer from the idiosyncrasies of the individual programmer. The same can be said for pro- 
gramming standards, which, by definition, can be effective only if they are both universally 
published and observed. 

If programmers followed consistent program organization rules and programming stand- 
ards, much of today’s documentation problem would not have arisen. The computer industry 
is almost 20 years old; it should stop philosophizing about what ought to be and resolve this 
unsatisfactory situation. 

Only automated documentation of programs offers any hope for realizing what may be 
called “accurate” program documentation. This paper will discuss how to improve automated 
documentation and, specifically, how the AUTOFLOW system can be enhanced to provide 
acceptable levels of documentation. 

Given that programmers may cooperate only to a limited extent in documenting their 
programs and that computer programs can be developed to generate information that could 
not be produced manually, the following three elements are essential for an integrated docu- 
mentation system within the framework of today’s data processing environment: 

(1) Logical analysis or graphic dissection of a program 

(2) History and control of programs 

(3) An understanding of the program 

A flowchart produced by AUTOFLOW is much more meaningful than one that has 
been produced manually. These logical flowcharts are accurate, present complete references 
between all transfer points, and graphically portray the logical flow by automatic rearrange- 
ment of those segments of the program that interact. Figure 1 is an example of a two- 
dimensional AUTOFLOW flowchart from a FORTRAN program. 

The number and type of cross-referenced reports produced by AUTOFLOW depend on 
the source language being used. For COBOL, AUTOFLOW can produce four special reports: 
procedure division summary, data name cross-reference listing, data division index, and data 
record map. For PL/I, four special reports are produced: on-unit action blocks, label- 
assignment cross-reference, duplicate declaration map, and condition prefix map. For 
FORTRAN, the one special report is the nonprocedural statements listing. Other special reports 
for FORTRAN could be produced by AUTOFLOW and would be of great value. Figures 2 
through 10 are hypothetical reports that could be produced from a FORTRAN program 
by systems such as AUTOFLOW. 

Figure 2 illustrates the header information that is common to all reports. The informa- 
tion includes the general title, FORTRAN analysis report; the user name, e.g., Goddard Space 
Flight Center; and the system. The run time for the analysis and the date are also presented. 
The report itself is essentially a listing of the local variables used by the program. The infor- 
mation presented is the mnemonic label, the type of variable, the definition of the variable, 
the line number where it is defined, the type and value of the definition, and then the ref- 
erences made by other statements in the FORTRAN source program to the local variable. 

References in all reports consist of the source line number and, in parentheses, the 
AUTOFLOW page and box number. The variable labels in the first column are sorted alpha- 
numerically. The label types are standard for IBM FORTRAN (integer 2, integer 4, real 4, 
real 8, logical, etc.). The DECLARATIONS column specifies where and how the variable is 
defined (i.e., through a data statement or an equivalence statement). If the variable is defined 
by a data statement, the value of the definition will be shown. Doubly-defined variables 
would be indicated by the notation DD in the definition area. 
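
As a rough illustration of the kind of listing described above, the following Python sketch builds a variable cross-reference from a few lines of fixed-form FORTRAN. It is not AUTOFLOW's method: the function name `cross_reference`, the keyword list, and the sample program are all assumptions made for illustration. The tokenizer is deliberately naive and produces only source line numbers, with no page and box references, variable types, or declaration details.

```python
import re
from collections import defaultdict

# Hypothetical and deliberately incomplete keyword list; a real analyzer
# would parse FORTRAN statements rather than filter tokens.
KEYWORDS = {"INTEGER", "REAL", "LOGICAL", "DATA", "DO", "IF", "GO", "TO",
            "CONTINUE", "END", "STOP", "CALL", "DIMENSION"}


def cross_reference(source_lines):
    """Map each identifier to the sorted line numbers that mention it."""
    refs = defaultdict(set)
    for line_no, line in enumerate(source_lines, start=1):
        if line[:1] in ("C", "*"):      # fixed-form comment line
            continue
        statement = line[6:].upper()    # skip statement number and continuation columns
        for name in re.findall(r"[A-Z][A-Z0-9]*", statement):
            if name not in KEYWORDS:
                refs[name].add(line_no)
    # Sort the labels alphanumerically, as the report description specifies.
    return {name: sorted(lines) for name, lines in sorted(refs.items())}


SAMPLE = [
    "C     SUM THE FIRST N INTEGERS",
    "      INTEGER N, TOTAL, I",
    "      DATA N /10/",
    "      TOTAL = 0",
    "      DO 10 I = 1, N",
    "      TOTAL = TOTAL + I",
    "   10 CONTINUE",
    "      END",
]

xref = cross_reference(SAMPLE)
for name, lines in xref.items():
    print(f"{name:<8}{lines}")
```

Each printed row pairs a variable label with the line numbers that reference it; the line containing the declaration is simply the first entry in this simplified form.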

Figure 3, a cross-reference of statement numbers, lists only those statements that can 
be referenced by other statements within a program, i.e., statements with statement 
numbers. The appropriate line number, flowchart location, and type of statement (e.g., format, 
computational, or assignment) are specified. Again, all references to each statement number 
are listed by line number and AUTOFLOW page and box references. 

Figure 1. AUTOFLOW flowchart for FORTRAN program. 

Figure 2. Header information. 

Figure 3. Cross-reference of statement numbers. 

Figure 4 is a cross-referenced listing of global variables used by the specific program 
that is being analyzed. This report is very similar to the local variable report, except that it 
lists only those variables that reside in blank or labeled common data areas. The information 
presented in the report includes the label mnemonic, the type of label, its definition, data 
used in the label, and all references to the label by other statements in the FORTRAN source 
program. The label type is broken down not only by data type (integer, real, logical, etc.) 
but also by the type of common area (whether it is blank common or label common and, if 
label common, by the mnemonic name of the label common area). 

Figure 5 is a summary of all of the variables used in all of the programs input to a 
single AUTOFLOW run and is similar to the local variable report for a specific FORTRAN 
program. It contains essentially the same kind of information presented in the local variable 
report: the mnemonic label for a variable, the type of variable, the definition of the data for the 
variable, and all references to that variable. The unique aspect of this report is that it does 
not reference only those local program variables that are accessible within a specific program 
but rather those variables that can be passed between programs through a common data area. 
In the references column, program identification, line number, and AUTOFLOW page and 
box number are indicated. 

Figure 6 is the program subroutine usage report. This presents the names of subroutines 
within an individual program, the call parameters that are used by or passed to the subrou- 
tine, and any references (by line number and AUTOFLOW page and box number) to that 
subroutine in the specific FORTRAN program being analyzed. In the call parameter area, 
the variable name that is being passed to the subroutine and some additional information 
are found. If a global variable is being passed to a subroutine for its own use, an ampersand 
is appended to the mnemonic label in the CALL statement. A second type of variable that 
may be passed is a dummy variable, one that is not directly used by the program. This is a 
variable that has been passed to the present subroutine by a calling subroutine. A dummy 
variable is indicated by the pound sign appended to it. A third parameter is a return address, 
indicated by an asterisk. The call parameter portion of the listing also specifies the levels of 
all variables that are local to the program. 
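
The marker convention for call parameters can be illustrated with a short Python sketch. The function `annotate_call` and its arguments are hypothetical: the caller is assumed to supply the sets of global names, dummy names, and return addresses, and leaving local variables unmarked is an assumption, since the report layout is described here only in outline.

```python
def annotate_call(params, global_vars, dummy_vars, return_addresses):
    """Append the report's marker to each call parameter.

    '&' marks a global variable, '#' a dummy variable, '*' a return address.
    Parameters in none of the sets are treated as local and left unmarked
    (an assumption made for this sketch).
    """
    marked = []
    for param in params:
        if param in return_addresses:
            marked.append(param + "*")
        elif param in dummy_vars:
            marked.append(param + "#")
        elif param in global_vars:
            marked.append(param + "&")
        else:
            marked.append(param)
    return marked


# Hypothetical call: X is local, NPTS is in common, BUF was passed in
# by the calling subroutine, and 100 is a return address.
listing = annotate_call(
    ["X", "NPTS", "BUF", "100"],
    global_vars={"NPTS"},
    dummy_vars={"BUF"},
    return_addresses={"100"},
)
print(", ".join(listing))
```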

Figure 7, the system subroutine usage report, is very similar to the program subroutine 
usage report. The name of the program containing the call, the subroutine name, and the 
local, global, and dummy parameters passed to the called subroutines are specified. The re- 
port summarizes all subroutine usage within all program modules processed in a single 
AUTOFLOW run. Briefly, this listing establishes the hierarchy of subroutine calls among 
the modules for a given execution. 

Figure 8 is the DO loop analysis report for a specific program. This listing indicates the 
complexity of the DO loop control within the program. The body of the report presents the 
source and flowchart locations of the start of the loop, the variables used for starting and 
ending values, and the increment used for the variable counter. 

The complexity map, a bar diagram constructed of X’s, depicts the logical structure of 
DO loops in a histogram format. This histogram graphically portrays the nesting effect. 
Additional information, such as an exit from within a loop to a statement external to the 
loop, is also shown by the histogram. A nest of three loops is represented by three vertical 
bars. The longest bar represents the initial DO loop, the next longest represents the second- 
level loop, and the shortest represents the third-level loop. The second part of this listing 
is the DO loop analysis summary, which specifies the loop level and the number of loops of 
different levels used in the program. 
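
The nesting histogram and the loop-level summary can be approximated with a few lines of Python. This is a minimal sketch under stated assumptions, not the report generator itself: the function `do_loop_map`, the regular expression, and the sample program are hypothetical, and the sketch recognizes only DO statements written with blanks between the fields and terminated by a labeled statement.

```python
import re
from collections import Counter

# Matches "DO <label> <variable> =" at the start of a statement (blanks required).
DO_RE = re.compile(r"^\s*DO\s+(\d+)\s+\w+\s*=", re.IGNORECASE)


def do_loop_map(lines):
    """Return (rows, summary).

    rows[i] is the nesting depth of source line i; summary maps each loop
    level to the number of loops opened at that level.
    """
    open_labels = []            # terminal labels of the loops currently open
    rows = []
    summary = Counter()
    for line in lines:
        label = line[:5].strip()        # statement number field
        statement = line[6:]
        match = DO_RE.match(statement)
        if match:
            open_labels.append(match.group(1))
            summary[len(open_labels)] += 1
        rows.append(len(open_labels))
        # A labeled statement closes every open loop that ends on it.
        while open_labels and open_labels[-1] == label:
            open_labels.pop()
    return rows, dict(summary)


SAMPLE = [
    "      DO 30 I = 1, 10",
    "      DO 20 J = 1, 5",
    "      DO 10 K = 1, 3",
    "      A(I,J,K) = 0.0",
    "   10 CONTINUE",
    "   20 CONTINUE",
    "   30 CONTINUE",
]

rows, summary = do_loop_map(SAMPLE)
for depth, line in zip(rows, SAMPLE):
    print(f"{'X' * depth:<4}{line}")
print(summary)
```

Read down the left margin, the first column of X's spans the whole nest, the second spans the second-level loop, and the third spans only the innermost loop, which mirrors the longest-to-shortest bars described above.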

Figure 9 is the assigned GO TO analysis by program. This listing presents the sequence, 
page and box numbers, and statement number of all the assigned GO TO statements in a 
FORTRAN program. Additionally, variable names used in the branch list for each assigned 
GO TO are presented. The right side of the report lists all references to particular assigned 
GO TO statements. If one of the variables in the branch list is not defined within the pro- 
gram, this variable name will be listed with a dollar sign indicator. This is particularly help- 
ful since undefined variables used in assigned GO TO statements will result in unpredictable 
destinations for the branch. The logic analysis section of this report presents program condi- 
tions that are probable program errors (e.g., undefined labels, unreferenced statements, un- 
defined variables, or transfers into a DO loop). 

Figure 10 is the statement usage and complexity factor report, which presents a weighted 
summary of statement types within a program. On the left side of the report is the state- 
ment type (such as assigned GO TO, computed GO TO, dimension, value, and computational) 
and the number of each type within a program. The listing also contains the information 
needed for the complexity factor analysis: the assigned weight factors and the weighted 
values automatically assigned to the different types of statements. The user may override the 
default values and assign his own weighted factors at execution time. The product of the 
number of statements of a particular type and the weight factor for that type is the usage 
factor. At the bottom of this report is a summary which shows the total number of state- 
ments in the FORTRAN program, the total weight (the sum of all the usage factors), and 
the program complexity (the computed value of the total weight divided by the total num- 
ber of statements). Program complexities range from 0.1 to 0.9. A factor of 0.5 would indi- 
cate that the program is of average complexity. The complexity factor is a useful guide for 
effective programmer assignment. 
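
The arithmetic behind the complexity factor can be sketched in Python as follows. The weights and statement counts shown are invented for illustration, since no actual default values are given here; `complexity_report` and `DEFAULT_WEIGHTS` are hypothetical names. Only the formulas come from the description above: usage factor = count × weight, total weight = sum of usage factors, and program complexity = total weight ÷ total number of statements.

```python
# Hypothetical default weights; actual values are not stated in the text.
DEFAULT_WEIGHTS = {
    "ASSIGNED GO TO": 0.9,
    "COMPUTED GO TO": 0.8,
    "COMPUTATIONAL": 0.5,
    "DIMENSION": 0.2,
}


def complexity_report(counts, overrides=None):
    """Return (usage_factors, total_statements, total_weight, complexity)."""
    # User-supplied weights replace the defaults, as the report allows.
    weights = {**DEFAULT_WEIGHTS, **(overrides or {})}
    usage = {kind: n * weights[kind] for kind, n in counts.items()}
    total_statements = sum(counts.values())
    if total_statements == 0:
        raise ValueError("program contains no statements")
    total_weight = sum(usage.values())
    return usage, total_statements, total_weight, total_weight / total_statements


# Invented statement counts for a 50-statement program.
COUNTS = {
    "ASSIGNED GO TO": 2,
    "COMPUTED GO TO": 1,
    "COMPUTATIONAL": 40,
    "DIMENSION": 7,
}

usage, total_statements, total_weight, complexity = complexity_report(COUNTS)
print(f"statements={total_statements} weight={total_weight:.1f} complexity={complexity:.2f}")
```

Because the complexity is a weighted average of the per-type weights, it can fall between 0.1 and 0.9 only if every weight lies in that range; that constraint on the weights is an inference from the stated range, not something the text spells out.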

HISTORY AND CONTROL OF PROGRAMS 

A program represents a considerable asset to an organization because it is usually 
costly to develop and is used to control functions within an organization ranging from the 
performance of simple accounting operations to the control of space flight programs. 

Many programs have a life span far in excess of 5 years. A case in point is the IBM 650 
program, which was simulated on the IBM 1401 after the IBM 650 was removed. The IBM 
1401 is now being simulated on the IBM 360 and will shortly be simulated on the IBM 370. 
Rumor has it that the IBM 650 program was actually simulating an IBM 604 tabulating 
function. 

Programs survive intact over long periods of time because they are infrequently run 
and, therefore, not economical to reprogram, or nobody really knows their contents (the 
fear factor). In general, today’s software technology is in such a deplorable condition for 
the latter reason. Programs such as The LIBRARIAN, an adjunct to the AUTOFLOW system, 
are available to monitor program activity; produce histories of changes; retain copies of old 
versions of programs; protect programs against unauthorized use; and provide complete 
indexes that give dates of modifications, reasons for changes, and other information neces- 
sary for the orderly maintenance of programs and data. 

UNDERSTANDING THE PROGRAM 

The next questions to be asked concern the function, organization, and reason for 
organization of a program. All these questions can be answered by “picking the brains” of 
the programmer and the designer. 

Given the aversion of most programmers to documentation, the tape recorder can be 
a very effective means of obtaining vital information. It is probably much easier for many 
programmers to sit down and record on a cassette all the details of program development 
than for them to take the time to write everything down. The taped information can be 
easily transcribed and converted to a machine-readable form for input to a system such as 
TEXT EDITOR. This system can be used to produce a finished document for permanent 
retention as the program history and enables a user to specify format, alter content, and 
expedite production of hard-copy documentation with a minimum of manual effort. In 
short, the programmer need only talk about his projects, and a final record of such dis- 
cussions can be automatically produced. 

The final issue that is critical for the overall effectiveness of documentation is whether 
it actually reflects the current status of program development. Outdated documentation can 
be only partially useful at best, and totally misleading at worst. The systems discussed, 
AUTOFLOW, The LIBRARIAN, and TEXT EDITOR, assure all users that the documentation 
will be not only accurate, standardized, and complete but also timely and readily available 
whenever needed. 

CONCLUSION 

In summary, the critical needs in the area of effective program documentation involve 
the integration of normal programming activities with the requirement for more comprehen- 
sive documentation. The ultimate solution to these needs lies in automated documentation 
systems that can reduce clerical effort on the part of the programmer, provide timely and 
accurate documentation whenever needed, analyze program design and structure, expedite 
maintenance and debugging operations, protect source programs from loss or damage, and 
provide an understanding of the program. Computer programs can do this and can do it 
better, faster, and more economically. 

DISCUSSION 

MEMBER OF THE AUDIENCE: I understand that AUTOFLOW is applicable to 
FORTRAN; is it also applicable to other programming languages? 

GOETZ: AUTOFLOW can be applied to all of the major languages in use today, includ- 
ing second-generation programming languages and various types of FORTRAN. 

MEMBER OF THE AUDIENCE: To your knowledge, does anyone else employ the 
tape recorder in the way that you have discussed, and what benefits does it offer to program- 
ming personnel? 

GOETZ: Although I am certain that it must be used elsewhere, I cannot provide any 
specific organization names. The technique makes it easier for the programmer to record 
information. The information generated is actually of better quality than that which would 
be produced if the programmer were required to write his documentation, since the pro- 
grammer becomes too self-conscious when he is writing. 

MEMBER OF THE AUDIENCE: Do you have any intention of writing a manual de- 
scribing the entire procedure that could be marketed? 

GOETZ: We have no current plans for doing that. 

MEMBER OF THE AUDIENCE: You have mentioned that AUTOFLOW is available 
for several different language systems. Does this diversity also extend to different computers? 

GOETZ: AUTOFLOW is not available for many machines; it is available for the Spectra 
70 series, the Honeywell series, and the IBM 7090 and 360 series. 

MEMBER OF THE AUDIENCE: Is there an extended AUTOFLOW available for the 
CDC 6600? 

GOETZ: No. The AUTOFLOW system is written in assembly language and cannot be 
transferred between machines. No AUTOFLOW was written for the CDC 6600. We do accept 
6600 programs (assembly language and the various FORTRANs, I believe), but the 
AUTOFLOW system does not operate with them. Also, the extended versions of the FORTRAN 
analysis are hypothetical systems that have not yet been constructed. The flowcharts 
and reports used in my paper were manually produced. 

MEMBER OF THE AUDIENCE: What use is made of the tape recorder in the develop- 
ment of the user documentation? 

GOETZ: The program documentation, providing the internal logic of the program, can 
best be obtained with the use of the tape recorder, but the user documentation is some- 
thing quite different. It should be well organized and produced in a more formal way than 
the program documentation. 

MEMBER OF THE AUDIENCE: Do the American National Standards Institute (ANSI) 
flowchart standards constrain the actual communication of information because of restric- 
tions placed on the size and proportion of symbols and the lack of symbols needed to ter- 
minate and then continue a line that is not related to the flow of the data or the logic of 
the program? Since symbols in modern languages can have as many as 30 characters, the 
standards, to a certain extent, inhibit communication because the programmer must limit 
what he says. 

GOETZ: Our current standards do not quite conform to ANSI standards. The width 
of a process box, for instance, must be related to its length, according to ANSI standards, 
but AUTOFLOW will produce a process box of virtually any size, so it could be 50 or 100 
lines long. We are upgrading our system so that it will conform completely to ANSI stand- 
ards, which will restrict or inhibit somewhat the flowchart produced. The user will then 
have the option of having ANSI or AUTOFLOW standards. 

MEMBER OF THE AUDIENCE: Do you consider the ANSI standards to be adequate 
or archaic? 

GOETZ: We think that they are somewhat archaic, but they are standards, and we are 
willing to conform. Therefore, we are producing the option. 

MEMBER OF THE AUDIENCE: Consider a program that was written without AUTO- 
FLOW in mind. If the program were then analyzed by AUTOFLOW, which would be the 
more useful: the analysis portion or the flowchart portion? 

GOETZ: It would depend upon who would be using the report. For the original pro- 
grammer, the analysis portion will suffice in many cases. For debugging and making program 
alterations, the flowchart is especially useful and would probably be a necessary aid if those 
functions were being performed by someone who was not the original programmer. The level 
of the programmer’s training would also be a consideration. 

MEMBER OF THE AUDIENCE: To what extent is AUTOFLOW used to document 
and maintain itself? 

GOETZ: The entire system is written in assembly language and contains chart codes in 
the comments portion of the program. By putting these chart codes in the program and con- 
sidering what the assembly language coding represents, we obtain very good narrative state- 
ments and comments. The very low personnel turnover that we have reduces considerably 
the need for producing flowcharts for maintenance purposes. 



